ITERATE: a conceptual clustering algorithm for data mining

نویسندگان

Gautam Biswas

Jerry B. Weinberg

Douglas H. Fisher

چکیده

|The data exploration task can be divided into three interrelated subtasks: (i) feature selection, (ii) discovery, and (iii) interpretation. This paper describes an unsupervised discovery method with biases geared toward partitioning objects into clusters that improve interpretability. The algorithm, ITERATE, employs: (i) a data ordering scheme and (ii) an iterative redistribution operator to produce maximally cohesive and distinct clusters. Cohesion or intra-class similarity is measured in terms of the match between individual objects and their assigned cluster prototype. Distinctness or inter-class dissimilarity is measured by an average of the variance of the distribution match between clusters. We demonstrate that interpretability, from a problem solving viewpoint, is addressed by the intraand interclass measures. Empirical results demonstrate the properties of the discovery algorithm, and its applications to problem solving. Keywords|knowledge discovery, data mining, conceptual clustering, concept formation, criterion function, order bias, iterative redistribution.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

An Improved SSPCO Optimization Algorithm for Solve of the Clustering Problem

Swarm Intelligence (SI) is an innovative artificial intelligence technique for solving complex optimization problems. Data clustering is the process of grouping data into a number of clusters. The goal of data clustering is to make the data in the same cluster share a high degree of similarity while being very dissimilar to data from other clusters. Clustering algorithms have been applied to a ...

متن کامل

An Improved SSPCO Optimization Algorithm for Solve of the Clustering Problem

متن کامل

Persistent K-Means: Stable Data Clustering Algorithm Based on K-Means Algorithm

Identifying clusters or clustering is an important aspect of data analysis. It is the task of grouping a set of objects in such a way those objects in the same group/cluster are more similar in some sense or another. It is a main task of exploratory data mining, and a common technique for statistical data analysis This paper proposed an improved version of K-Means algorithm, namely Persistent K...

متن کامل

Improved Automatic Clustering Using a Multi-Objective Evolutionary Algorithm With New Validity measure and application to Credit Scoring

In data mining, clustering is one of the important issues for separation and classification with groups like unsupervised data. In this paper, an attempt has been made to improve and optimize the application of clustering heuristic methods such as Genetic, PSO algorithm, Artificial bee colony algorithm, Harmony Search algorithm and Differential Evolution on the unlabeled data of an Iranian bank...

متن کامل

Solving Data Clustering Problems using Chaos Embedded Cat Swarm Optimization

In this paper, a new method is proposed for solving the data clustering problem using Cat Swarm Optimization (CSO) algorithm based on chaotic behavior. The problem of data clustering is an important section in the field of the data mining, which has always been noted by researchers and experts in data mining for its numerous applications in solving real-world problems. The CSO algorithm is one ...

متن کامل

ذخیره در منابع من

ذخیره در منابع من قبلا به منابع من ذحیره شده

{@ msg_add @}

با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

IEEE Trans. Systems, Man, and Cybernetics, Part C

دوره 28 شماره

صفحات -

تاریخ انتشار 1998

ITERATE: a conceptual clustering algorithm for data mining

نویسندگان

چکیده

منابع مشابه

An Improved SSPCO Optimization Algorithm for Solve of the Clustering Problem

An Improved SSPCO Optimization Algorithm for Solve of the Clustering Problem

Persistent K-Means: Stable Data Clustering Algorithm Based on K-Means Algorithm

Improved Automatic Clustering Using a Multi-Objective Evolutionary Algorithm With New Validity measure and application to Credit Scoring

Solving Data Clustering Problems using Chaos Embedded Cat Swarm Optimization

عنوان ژورنال:

اشتراک گذاری